Method for adaptive Kalman filtering in dynamic systems

ABSTRACT

The invention is based on the use of the principles of Lange's Fast Kalman-Filtering (FKF™) for large process control, prediction or warning systems where other computing methods are either too slow or fail because of truncation errors. The invented method makes it possible to exploit the FKF method for adaptive Kalman Filtering of dynamic multiparameter systems with large moving data-windows.

TECHNICAL FIELD

This invention relates generally to all practical applications of Kalman filtering and more particularly to controlling dynamic systems with a need for fast and reliable adaptation to circumstances.

BACKGROUND ART

Prior to explaining the invention, it is helpful to understand first the prior art of conventional Kalman recursions as well as the Fast Kalman Filtering (FKF) method for both calibrating a sensor system PCT/FI90/00122 (WO 90/13794) and controlling a large dynamic system PCT/FI93/00192 (WO 93/22625).

The underlying Markov (finite memory) process is described by the equations from (1) to (3). The first equation tells how a measurement vector y_(t) depends on a state vector s_(t) at time point t, (t=0,1,2 . . . ). This is the linearized Measurement (or observation) equation:

y _(t)=H_(t) s _(t) +e _(t)  (1)

Matrix H_(t) is the design (Jacobian) matrix that stems from the partial derivatives of actual physical dependencies. The second equation describes the time evolution of the overall system and it is known as the linearized System (or state) equation:

s _(t)=A_(t) s _(t−1)+B_(t) u _(t−1) +a _(t)  (2)

Matrix A_(t) is the state transition (Jacobian) matrix and B_(t) is the control gain (Jacobian) matrix. Equation (2) tells how present state s_(t) of the overall system develops from previous state s_(t−1), control/external forcing u_(t−1) and random error a_(t) effects. When measurement errors e_(t) and system errors a_(t) are neither auto- (i.e. white noise) nor cross-correlated and are given by the following covariance matrices:

Q_(t)=Cov(a _(t))=E(a _(t) a _(t)′)

and

R_(t)=Cov(e _(t))=E(e _(t) e _(t)′)  (3)

then the famous Kalman forward recursion formulae from (4) to (6) give us Best Linear Unbiased Estimate (BLUE) ŝ_(t) of present state s_(t) as follows:

ŝ _(t)=A_(t) ŝ _(t−1)+B_(t) u _(t−1)+K_(t) {y _(t)−H_(t)(A_(t) ŝ _(t−1)+B_(t) u _(t−1))}  (4)

and the covariance matrix of its estimation errors as follows:

P̂_(t)=Cov(ŝ _(t))=E{(ŝ _(t) −s _(t))(ŝ _(t) −s _(t))′}=A_(t)P̂_(t−1)A′_(t)+Q_(t)−K_(t)H_(t)(A_(t)P̂_(t−1)A′_(t)+Q_(t))  (5)

where the Kalman gain matrix K_(t) is defined by

K_(t)=(A_(t)P̂_(t−1)A′_(t)+Q_(t))H′_(t){H_(t)(A_(t)P̂_(t−1)A′_(t)+Q_(t))H′_(t)+R_(t)}⁻¹  (6)

This recursive linear solution is (locally) optimal. The stability of the Kalman Filter (KF) also requires that the observability and controllability conditions be satisfied (Kalman, 1960). However, equation (6) too often requires an overly large matrix to be inverted. The number n of rows (and columns) of this matrix equals the number of elements in measurement vector y_(t), and a large n is needed to satisfy the observability and controllability conditions. This is the problem solved by the discoveries reported here and in PCT/FI90/00122 and PCT/FI93/00192.
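As an illustration only, one step of recursions (4) to (6) can be sketched in Python with NumPy. The function name `kalman_step`, the toy dimensions and the random test matrices are assumptions made for this sketch and are not part of the specification. The n×n matrix inverted in the gain computation is the bottleneck discussed above.

```python
import numpy as np

def kalman_step(s_prev, P_prev, y, u_prev, A, B, H, Q, R):
    """One conventional Kalman recursion step, equations (4) to (6)."""
    s_pred = A @ s_prev + B @ u_prev          # A s^_{t-1} + B u_{t-1}
    P_pred = A @ P_prev @ A.T + Q             # A P^_{t-1} A' + Q
    # n x n matrix (n = number of measurements) that must be inverted in (6)
    innovation_cov = H @ P_pred @ H.T + R
    K = P_pred @ H.T @ np.linalg.inv(innovation_cov)   # equation (6)
    s_hat = s_pred + K @ (y - H @ s_pred)              # equation (4)
    P_hat = P_pred - K @ H @ P_pred                    # equation (5)
    return s_hat, P_hat, K

# Toy example with hypothetical numbers: m = 2 states, n = 3 measurements.
rng = np.random.default_rng(0)
m, n = 2, 3
A = np.array([[1.0, 0.1], [0.0, 1.0]])
B = np.eye(m)
H = rng.normal(size=(n, m))
Q = 0.01 * np.eye(m)
R = 0.5 * np.eye(n)
s_prev, P_prev = np.zeros(m), np.eye(m)
u_prev = np.array([0.0, 0.1])
y = rng.normal(size=n)

s_hat, P_hat, K = kalman_step(s_prev, P_prev, y, u_prev, A, B, H, Q, R)
```

The cost of `np.linalg.inv(innovation_cov)` grows roughly with the cube of n, which is why a large measurement vector makes this form impractical.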

The following modified form of the State equation has been introduced

A_(t) ŝ _(t−1)+B_(t) u _(t−1)=Is _(t)+A_(t)(ŝ _(t−1) −s _(t−1))−a _(t)  (7)

and combined with the Measurement equation (1) in order to obtain the so-called Augmented Model: $\begin{matrix} {{\begin{bmatrix} y_{t} \\ {{A_{t}{\hat{s}}_{t - 1}} + {B_{t}u_{t - 1}}} \end{bmatrix} = {{\begin{bmatrix} H_{t} \\ I \end{bmatrix}s_{t}} + \begin{bmatrix} e_{t} \\ {{A_{t}\left( {{\hat{s}}_{t - 1} - s_{t - 1}} \right)} - a_{t}} \end{bmatrix}}}{{i.e.\quad z_{t}} = {{Z_{t}s_{t}} + \eta_{t}}}} & (8) \end{matrix}$

The state parameters can be estimated by using the well-known solution of a Regression Analysis problem as follows:

ŝ _(t)=(Z′_(t)V_(t) ⁻¹Z_(t))⁻¹Z′_(t)V_(t) ⁻¹ z _(t)  (9)

The result is algebraically equivalent to use of the Kalman recursions but not numerically (see e.g. Harvey, 1981: “Time Series Models”, Philip Allan Publishers Ltd, Oxford, UK, pp. 101-119). The dimension of the matrix to be inverted in equation (9) is now the number (=m) of elements in state vector s_(t). Harvey's approach is fundamental to all different variations of the Fast Kalman Filtering (FKF) method.
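The algebraic equivalence stated above can be checked numerically on a small case. The following is a minimal sketch, assuming NumPy and randomly generated toy matrices; all variable names are hypothetical and chosen for this illustration only.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 3, 6                                       # m states, n measurements (toy sizes)
H = rng.normal(size=(n, m))
R = np.diag(rng.uniform(0.5, 1.5, size=n))        # Cov(e_t)
P_pred = np.diag(rng.uniform(0.5, 1.5, size=m))   # Cov{A(s^_{t-1} - s_{t-1}) - a_t}
s_pred = rng.normal(size=m)                       # A s^_{t-1} + B u_{t-1}
y = rng.normal(size=n)

# Augmented Model (8): z = Z s + eta, with block-diagonal Cov(eta) = V
z = np.concatenate([y, s_pred])
Z = np.vstack([H, np.eye(m)])
V = np.block([[R, np.zeros((n, m))], [np.zeros((m, n)), P_pred]])

# Regression solution (9): the normal-equation matrix Z'V^{-1}Z is only m x m.
# V is inverted densely here for brevity; its blocks could be inverted separately.
V_inv = np.linalg.inv(V)
s_hat_regression = np.linalg.solve(Z.T @ V_inv @ Z, Z.T @ V_inv @ z)

# Conventional update (4) with gain (6): needs the inverse of an n x n matrix
K = P_pred @ H.T @ np.linalg.inv(H @ P_pred @ H.T + R)
s_hat_kalman = s_pred + K @ (y - H @ s_pred)
```

Both estimates agree to rounding error in this toy case; the numerical behaviour for large, ill-conditioned systems is what distinguishes the two forms.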

An initialization or temporary training of any large Kalman Filter (KF), in order to make the observability condition satisfied, can be done by Lange's High-pass Filter (Lange, 1988). It exploits an analytical sparse-matrix inversion formula for solving regression models with the following so-called Canonical Block-angular matrix structure: $\begin{matrix} {\begin{bmatrix} y_{1} \\ y_{2} \\ \vdots \\ y_{K} \end{bmatrix} = {{\begin{bmatrix} X_{1} & \quad & \quad & \quad & G_{1} \\ \quad & X_{2} & \quad & \quad & G_{2} \\ \quad & \quad & ⋰ & \quad & \vdots \\ \quad & \quad & \quad & X_{K} & G_{K} \end{bmatrix}\quad\begin{bmatrix} b_{1} \\ \vdots \\ b_{K} \\ c \end{bmatrix}} + \begin{bmatrix} e_{1} \\ e_{2} \\ \vdots \\ e_{K} \end{bmatrix}}} & (10) \end{matrix}$

This is the matrix representation of the Measurement equation of e.g. an entire windfinding intercomparison experiment. The vectors b₁,b₂, . . . ,b_(K) typically refer to consecutive position coordinates e.g. of a weather balloon but may also contain those calibration parameters that have a significant time or space variation. The vector c refers to the other calibration parameters that are constant over the sampling period.

For all large multiple sensor systems their design matrices H_(t) are sparse. Thus, one can do in one way or another the same sort of Partitioning: $s_{t} = \begin{bmatrix} b_{t,1} \\ \vdots \\ b_{t,K} \\ c_{t} \end{bmatrix}, \quad y_{t} = \begin{bmatrix} y_{t,1} \\ y_{t,2} \\ \vdots \\ y_{t,K} \end{bmatrix}, \quad H_{t} = \begin{bmatrix} X_{t,1} & & & & G_{t,1} \\ & X_{t,2} & & & G_{t,2} \\ & & \ddots & & \vdots \\ & & & X_{t,K} & G_{t,K} \end{bmatrix} \qquad (11)$ $A = \begin{bmatrix} A_{1} & & & \\ & \ddots & & \\ & & A_{K} & \\ & & & A_{c} \end{bmatrix} \quad \text{and} \quad B = \begin{bmatrix} B_{1} & & & \\ & \ddots & & \\ & & B_{K} & \\ & & & B_{c} \end{bmatrix}$

where

c_(t) typically represents calibration parameters at time t

b_(t,k) all other state parameters in the time and/or space volume

A state transition matrix (block-diagonal) at time t

B matrix (block-diagonal) for state-independent effects u_(t) at time t.

If the partitioning is not obvious, one may try to do it automatically by using a specific algorithm that converts every sparse linear system into the Canonical Block-angular form (Weil and Kettler, 1971: “Rearranging Matrices to Block-angular Form for Decomposition (and other) Algorithms”, Management Science, Vol. 18, No. 1, September 1971, pages 98-107). The covariance matrix of random errors e_(t) may, however, lose something of its original and simple diagonality.

Consequently, gigantic Regression Analysis problems were faced as follows:

Augmented Model for a space volume case: e.g. for the data of a balloon tracking experiment with K consecutive balloon positions: $\begin{bmatrix} \begin{matrix} y_{t,1} \\ A_{1}\hat{b}_{t-1,1} + B_{1}u_{b_{t-1,1}} \end{matrix} \\ \begin{matrix} y_{t,2} \\ A_{2}\hat{b}_{t-1,2} + B_{2}u_{b_{t-1,2}} \end{matrix} \\ \vdots \\ \begin{matrix} y_{t,K} \\ A_{K}\hat{b}_{t-1,K} + B_{K}u_{b_{t-1,K}} \end{matrix} \\ A_{c}\hat{c}_{t-1} + B_{c}u_{c_{t-1}} \end{bmatrix} = \begin{bmatrix} \begin{matrix} X_{t,1} \\ I \end{matrix} & & & & G_{t,1} \\ & \begin{matrix} X_{t,2} \\ I \end{matrix} & & & G_{t,2} \\ & & \ddots & & \vdots \\ & & & \begin{matrix} X_{t,K} \\ I \end{matrix} & G_{t,K} \\ & & & & I \end{bmatrix} \begin{bmatrix} b_{t,1} \\ b_{t,2} \\ \vdots \\ b_{t,K} \\ c_{t} \end{bmatrix} + \begin{bmatrix} \begin{matrix} e_{t,1} \\ A_{1}(\hat{b}_{t-1,1} - b_{t-1,1}) - a_{b_{t,1}} \end{matrix} \\ \begin{matrix} e_{t,2} \\ A_{2}(\hat{b}_{t-1,2} - b_{t-1,2}) - a_{b_{t,2}} \end{matrix} \\ \vdots \\ \begin{matrix} e_{t,K} \\ A_{K}(\hat{b}_{t-1,K} - b_{t-1,K}) - a_{b_{t,K}} \end{matrix} \\ A_{c}(\hat{c}_{t-1} - c_{t-1}) - a_{c_{t}} \end{bmatrix}$

Augmented Model for a moving time volume (e.g. for “whitening” an observed “innovations” sequence of residuals e_(t) for a moving sample of length L): $\begin{bmatrix} \begin{matrix} y_{t} \\ A\hat{s}_{t-1} + Bu_{t-1} \end{matrix} \\ \begin{matrix} y_{t-1} \\ A\hat{s}_{t-2} + Bu_{t-2} \end{matrix} \\ \vdots \\ \begin{matrix} y_{t-L+1} \\ A\hat{s}_{t-L} + Bu_{t-L} \end{matrix} \\ A\hat{C}_{t-1} + Bu_{c_{t-1}} \end{bmatrix} = \begin{bmatrix} \begin{matrix} H_{t} \\ I \end{matrix} & & & & F_{t} \\ & \begin{matrix} H_{t-1} \\ I \end{matrix} & & & F_{t-1} \\ & & \ddots & & \vdots \\ & & & \begin{matrix} H_{t-L+1} \\ I \end{matrix} & F_{t-L+1} \\ & & & & I \end{bmatrix} \begin{bmatrix} s_{t} \\ s_{t-1} \\ \vdots \\ s_{t-L+1} \\ C_{t} \end{bmatrix} + \begin{bmatrix} \begin{matrix} e_{t} \\ A(\hat{s}_{t-1} - s_{t-1}) - a_{t} \end{matrix} \\ \begin{matrix} e_{t-1} \\ A(\hat{s}_{t-2} - s_{t-2}) - a_{t-1} \end{matrix} \\ \vdots \\ \begin{matrix} e_{t-L+1} \\ A(\hat{s}_{t-L} - s_{t-L}) - a_{t-L+1} \end{matrix} \\ A(\hat{C}_{t-1} - C_{t-1}) - a_{c_{t}} \end{bmatrix}$

Please note that the latter matrix equation has a “nested” Block-Angular structure. There are two types of “calibration” parameters, c_(t) and C_(t). The first type, c_(t), can vary from one time to another. The second type, C_(t), has (at least approximately) constant values over long moving time windows of length L, and it is these parameters that make the Kalman filtering process an adaptive one. Solving for C_(t) with the conventional Kalman recursions from (4) to (6) causes an observability problem because, for computational reasons, length L must be short. But with the FKF formulae of PCT/FI90/00122, the sample size can be so large that no initialization (or training) may be needed at all.

Prior to explaining the method of PCT/FI93/00192, it will be helpful to first understand some prior art of the Kalman Filtering (KF) theory exploited in experimental Numerical Weather Prediction (NWP) systems. As previously, they also make use of equation (1):

Measurement Equation: y _(t)=H_(t) s _(t) +e _(t)  (linearized regression)

where state vector s_(t) describes the state of the atmosphere at time t. Now, s_(t) usually represents all gridpoint values of atmospheric variables e.g. the geopotential heights (actually, their departure values from the actual values estimated by some means) of different pressure levels.

The dynamics of the atmosphere is governed by a well-known set of partial differential equations (“primitive” equations). By using e.g. the tangent linear approximation of the NWP model the following expression of equation (2) is obtained for the time evolution of state parameters s_(t) (actually, their departure values from a “trajectory” in the space of state parameters generated with the nonlinear NWP model) at a time step:

State Equation: s _(t)=As _(t−1)+Bu _(t−1) +a _(t)  (the discretized dyn-stoch.model)

The four-dimensional data assimilation results (ŝ_(t)) and the NWP forecasts (s̃_(t)), respectively, are obtained from the Kalman Filter system as follows:

ŝ _(t)=s̃ _(t)+K_(t)(y _(t)−H_(t) s̃ _(t))

s̃ _(t)=Aŝ _(t−1)+Bu _(t−1)  (12)

where

P_(t)=Cov(s̃ _(t))=A Cov(ŝ _(t−1))A′+Q_(t)  (prediction accuracy)

Q_(t)=Cov(a _(t))=Ea _(t) a′ _(t)  (system noise)

R_(t)=Cov(e _(t))=Ee _(t) e′ _(t)  (measurement noise)

and the crucial Updating computations are performed with the following Kalman Recursion:

K_(t)=P_(t)H′_(t)(H_(t)P_(t)H′_(t)+R_(t))⁻¹  (Kalman Gain matrix)

Cov(ŝ _(t))=P_(t)−K_(t)H_(t)P_(t)  (estimation accuracy).

The matrix inversion needed here for the computation of the Kalman Gain matrix is overwhelmingly difficult to compute for a real NWP system because the data assimilation system must be able to digest several million data elements at a time. Dr. T. Gal-Chen reported on this problem in 1988 as follows: “There is hope that the developments of massively parallel supercomputers (e.g., 1000 desktop CRAYs working in tandem) could result in algorithms much closer to optimal . . . ”, see “Report of the Critical Review Panel—Lower Tropospheric Profiling Symposium: Needs and Technologies”, Bulletin of the American Meteorological Society, Vol. 71, No. 5, May 1990, page 684.

The method of PCT/FI93/00192 exploits the Augmented Model approach from equation (8): $\begin{bmatrix} y_{t} \\ {{A{\hat{s}}_{t - 1}} + {Bu}_{t - 1}} \end{bmatrix} = {{\begin{bmatrix} H_{t} \\ I \end{bmatrix}s_{t}} + \begin{bmatrix} e_{t} \\ {{A\left( {{\hat{s}}_{t - 1} - s_{t - 1}} \right)} - a_{t}} \end{bmatrix}}$ i.e.  z_(t) = Z_(t)s_(t) + η_(t)

The following two sets of equations are obtained for Updating purposes: $\begin{matrix} \hat{s}_{t} = \left( Z_{t}' V_{t}^{-1} Z_{t} \right)^{-1} Z_{t}' V_{t}^{-1} z_{t} & \text{(optimal estimation, by Gauss-Markov)} \\ \quad = \left\{ H_{t}' R_{t}^{-1} H_{t} + P_{t}^{-1} \right\}^{-1} \left( H_{t}' R_{t}^{-1} y_{t} + P_{t}^{-1} \tilde{s}_{t} \right) & \text{or,} \\ \quad = \tilde{s}_{t} + K_{t} \left( y_{t} - H_{t} \tilde{s}_{t} \right) & \text{(alternatively)} \end{matrix} \qquad (13)$ and $\begin{matrix} \mathrm{Cov}(\hat{s}_{t}) = E(\hat{s}_{t} - s_{t})(\hat{s}_{t} - s_{t})' = \left( Z_{t}' V_{t}^{-1} Z_{t} \right)^{-1} & \\ \quad = \left\{ H_{t}' R_{t}^{-1} H_{t} + P_{t}^{-1} \right\}^{-1} & \text{(estimation accuracy)} \end{matrix} \qquad (14)$

where,

s̃ _(t)=Aŝ _(t−1)+Bu _(t−1)  (NWP “forecasting”)

P_(t)=Cov(s̃ _(t))=A Cov(ŝ _(t−1))A′+Q_(t)  (15)

but instead of

K_(t)=P_(t)H′_(t)(H_(t)P_(t)H′_(t)+R_(t))⁻¹  (Kalman Gain matrix)

the FKF method of PCT/FI93/00192 takes

K_(t)=Cov(ŝ _(t))H′_(t)R_(t) ⁻¹  (16)

The Augmented Model approach is superior to the use of the conventional Kalman recursions for a large vector of input data y_(t) because the computation of the Kalman Gain matrix K_(t) requires the huge matrix inversion when Cov(ŝ_(t)) is unknown. Both methods are algebraically and statistically equivalent but certainly not numerically.
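The algebraic equivalence of the two gain expressions can be illustrated on a small case. This is a sketch only, assuming NumPy, a diagonal R_(t) and random toy matrices; the variable names are hypothetical. Equation (14) supplies Cov(ŝ_(t)) through an m×m inversion, after which equation (16) gives the gain without inverting any n×n matrix other than R_(t) itself.

```python
import numpy as np

rng = np.random.default_rng(2)
m, n = 3, 8                                   # few states, more measurements
H = rng.normal(size=(n, m))
R = np.diag(rng.uniform(0.5, 2.0, size=n))    # measurement noise, diagonal here
root = rng.normal(size=(m, m))
P = root @ root.T + np.eye(m)                 # P_t, symmetric positive definite

# Conventional gain: inverse of the n x n matrix H P H' + R
K_conventional = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)

# Equation (14), then (16): m x m inversions plus the trivial inverse of diagonal R
R_inv = np.diag(1.0 / np.diag(R))
cov_s_hat = np.linalg.inv(H.T @ R_inv @ H + np.linalg.inv(P))
K_fkf = cov_s_hat @ H.T @ R_inv
```

When R_(t) is block-diagonal rather than diagonal, its inverse can still be formed block by block, which keeps the cost tied to m and to the block sizes rather than to n.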

However, the Augmented Model formulae are still too difficult to be handled numerically. This is, firstly, because state vector s_(t) consists of a large number (=m) of gridpoint data for a realistic representation of the atmosphere. Secondly, there are many other state parameters that must be included in the state vector for a realistic NWP system. These are first of all related to systematic (calibration) errors of observing systems as well as to the so-called physical parameterization schemes of small scale atmospheric processes.

The calibration problems are overcome in PCT/FI93/00192 by using the method of decoupling states. It is done by performing the following Partitioning: $s_{t} = \begin{bmatrix} b_{t,1} \\ \vdots \\ b_{t,K} \\ c_{t} \end{bmatrix}, \quad y_{t} = \begin{bmatrix} y_{t,1} \\ y_{t,2} \\ \vdots \\ y_{t,K} \end{bmatrix}, \quad H_{t} = \begin{bmatrix} X_{t,1} & & & & G_{t,1} \\ & X_{t,2} & & & G_{t,2} \\ & & \ddots & & \vdots \\ & & & X_{t,K} & G_{t,K} \end{bmatrix} \qquad (17)$ and $A_{t} = \begin{bmatrix} A_{t,1} \\ \vdots \\ A_{t,K} \\ A_{t,c} \end{bmatrix} \quad \text{and} \quad B_{t} = \begin{bmatrix} B_{t,1} \\ \vdots \\ B_{t,K} \\ B_{t,c} \end{bmatrix}$

where

c_(t) typically represents “calibration” parameters at time t

b_(t,k) values of atmospheric parameters at gridpoint k (k=1, . . . K)

A state transition matrix at time t (submatrices A₁, . . . ,A_(K),A_(c))

B control gain matrix (submatrices B₁, . . . ,B_(K),B_(c)).

Consequently, the following gigantic Regression Analysis problem was faced: $\begin{bmatrix} \begin{matrix} y_{t,1} \\ A_{1}\hat{s}_{t-1} + B_{1}u_{t-1} \end{matrix} \\ \begin{matrix} y_{t,2} \\ A_{2}\hat{s}_{t-1} + B_{2}u_{t-1} \end{matrix} \\ \vdots \\ \begin{matrix} y_{t,K} \\ A_{K}\hat{s}_{t-1} + B_{K}u_{t-1} \end{matrix} \\ A_{c}\hat{s}_{t-1} + B_{c}u_{t-1} \end{bmatrix} = \begin{bmatrix} \begin{matrix} X_{t,1} \\ I \end{matrix} & & & & G_{t,1} \\ & \begin{matrix} X_{t,2} \\ I \end{matrix} & & & G_{t,2} \\ & & \ddots & & \vdots \\ & & & \begin{matrix} X_{t,K} \\ I \end{matrix} & G_{t,K} \\ & & & & I \end{bmatrix} \begin{bmatrix} b_{t,1} \\ b_{t,2} \\ \vdots \\ b_{t,K} \\ c_{t} \end{bmatrix} + \begin{bmatrix} \begin{matrix} e_{t,1} \\ A_{1}(\hat{s}_{t-1} - s_{t-1}) - a_{b_{t,1}} \end{matrix} \\ \begin{matrix} e_{t,2} \\ A_{2}(\hat{s}_{t-1} - s_{t-1}) - a_{b_{t,2}} \end{matrix} \\ \vdots \\ \begin{matrix} e_{t,K} \\ A_{K}(\hat{s}_{t-1} - s_{t-1}) - a_{b_{t,K}} \end{matrix} \\ A_{c}(\hat{s}_{t-1} - s_{t-1}) - a_{c_{t}} \end{bmatrix} \qquad (18)$

The Fast Kalman Filter (FKF) formulae for the recursion step at any time point t were as follows: $\hat{b}_{t,k} = \left\{ X_{t,k}' V_{t,k}^{-1} X_{t,k} \right\}^{-1} X_{t,k}' V_{t,k}^{-1} \left( y_{t,k} - G_{t,k}\hat{c}_{t} \right) \quad \text{for } k = 1, 2, \ldots, K$ $\hat{c}_{t} = \left\{ \sum_{k=0}^{K} G_{t,k}' R_{t,k} G_{t,k} \right\}^{-1} \sum_{k=0}^{K} G_{t,k}' R_{t,k} y_{t,k} \qquad (19)$

where, for k=1,2, . . . ,K,

R_(t,k)=V_(t,k) ⁻¹{I−X_(t,k){X′_(t,k)V_(t,k) ⁻¹X_(t,k)}⁻¹X′_(t,k)V_(t,k) ⁻¹}

$V_{t,k} = \begin{bmatrix} \mathrm{Cov}(e_{t,k}) & \\ & \mathrm{Cov}\left\{ A_{k}(\hat{s}_{t-1} - s_{t-1}) - a_{b_{t,k}} \right\} \end{bmatrix}$ $y_{t,k} = \begin{bmatrix} y_{t,k} \\ A_{k}\hat{s}_{t-1} + B_{k}u_{t-1} \end{bmatrix}$ $X_{t,k} = \begin{bmatrix} X_{t,k} \\ I \end{bmatrix}$ $G_{t,k} = \begin{bmatrix} G_{t,k} \\ 0 \end{bmatrix}$

and, i.e. for k=0,

R_(t,0)=V_(t,0) ⁻¹

V_(t,0)=Cov{A_(c)(ŝ _(t−1) −s _(t−1))−a _(c) _(t) }

y _(t,0)=A_(c) ŝ _(t−1)+B_(c) u _(t−1)

G_(t,0)=I.

The data assimilation accuracies were obtained from equation (20) as follows: $\mathrm{Cov}(\hat{s}_{t}) = \mathrm{Cov}(\hat{b}_{t,1}, \ldots, \hat{b}_{t,K}, \hat{c}_{t}) = \begin{bmatrix} C_{1} + D_{1}SD_{1}' & D_{1}SD_{2}' & \cdots & D_{1}SD_{K}' & -D_{1}S \\ D_{2}SD_{1}' & C_{2} + D_{2}SD_{2}' & \cdots & D_{2}SD_{K}' & -D_{2}S \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ D_{K}SD_{1}' & D_{K}SD_{2}' & \cdots & C_{K} + D_{K}SD_{K}' & -D_{K}S \\ -SD_{1}' & -SD_{2}' & \cdots & -SD_{K}' & S \end{bmatrix} \qquad (20)$ where $C_{k} = \left\{ X_{t,k}' V_{t,k}^{-1} X_{t,k} \right\}^{-1} \quad \text{for } k = 1, 2, \ldots, K$ $D_{k} = \left\{ X_{t,k}' V_{t,k}^{-1} X_{t,k} \right\}^{-1} X_{t,k}' V_{t,k}^{-1} G_{t,k} \quad \text{for } k = 1, 2, \ldots, K$ $S = \left\{ \sum_{k=0}^{K} G_{t,k}' R_{t,k} G_{t,k} \right\}^{-1}$
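The block-wise solution (19) and the matrix S of (20) can be compared with a dense generalized least-squares solve on a small block-angular problem. The sketch below assumes NumPy; the function names `fkf_block_angular` and `dense_gls`, the block sizes and the random data are hypothetical. The lists `Xs`, `Gs`, `Vs`, `ys` stand for the blocks X_(t,k), G_(t,k), V_(t,k), y_(t,k) for k=1, . . . ,K, and `y0`, `V0` for the k=0 terms with G_(t,0)=I.

```python
import numpy as np

def fkf_block_angular(ys, Xs, Gs, Vs, y0, V0):
    """Solve y_k = X_k b_k + G_k c + e_k (k = 1..K) plus y_0 = c + e_0, as in (19)."""
    V0_inv = np.linalg.inv(V0)
    lhs = V0_inv.copy()                  # k = 0 term: G_0 = I, R_0 = V_0^{-1}
    rhs = V0_inv @ y0
    projectors = []
    for y, X, G, V in zip(ys, Xs, Gs, Vs):
        V_inv = np.linalg.inv(V)
        C = np.linalg.inv(X.T @ V_inv @ X)     # C_k of (20)
        P = C @ X.T @ V_inv                    # {X'V^{-1}X}^{-1} X'V^{-1}
        R = V_inv - V_inv @ X @ P              # R_k = V^{-1}{I - X P}
        lhs += G.T @ R @ G
        rhs += G.T @ R @ y
        projectors.append(P)
    S = np.linalg.inv(lhs)                     # S of (20), i.e. Cov(c^)
    c_hat = S @ rhs
    b_hats = [P @ (y - G @ c_hat) for P, y, G in zip(projectors, ys, Gs)]
    return b_hats, c_hat, S

def dense_gls(ys, Xs, Gs, Vs, y0, V0):
    """Reference solution: assemble the full block-angular system and solve it at once."""
    nc = y0.shape[0]
    n_rows = sum(y.shape[0] for y in ys) + nc
    n_cols = sum(X.shape[1] for X in Xs) + nc
    Z = np.zeros((n_rows, n_cols))
    W = np.zeros((n_rows, n_rows))
    r = c = 0
    for X, G, V in zip(Xs, Gs, Vs):
        n, p = X.shape
        Z[r:r + n, c:c + p] = X
        Z[r:r + n, -nc:] = G
        W[r:r + n, r:r + n] = np.linalg.inv(V)
        r += n
        c += p
    Z[r:, -nc:] = np.eye(nc)
    W[r:, r:] = np.linalg.inv(V0)
    z = np.concatenate(ys + [y0])
    cov = np.linalg.inv(Z.T @ W @ Z)
    return cov @ (Z.T @ W @ z), cov

rng = np.random.default_rng(3)
K_blocks, n_k, m_b, m_c = 4, 5, 2, 3           # toy sizes
Xs = [rng.normal(size=(n_k, m_b)) for _ in range(K_blocks)]
Gs = [rng.normal(size=(n_k, m_c)) for _ in range(K_blocks)]
Vs = [np.diag(rng.uniform(0.5, 1.5, size=n_k)) for _ in range(K_blocks)]
ys = [rng.normal(size=n_k) for _ in range(K_blocks)]
y0, V0 = rng.normal(size=m_c), np.eye(m_c)

b_hats, c_hat, S = fkf_block_angular(ys, Xs, Gs, Vs, y0, V0)
reference, reference_cov = dense_gls(ys, Xs, Gs, Vs, y0, V0)
```

In the block-wise version the largest matrices inverted have the size of one block or of the common vector c, whereas the dense reference inverts a matrix whose size grows with K.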

Kalman Filter (KF) studies have also been reported e.g. by Stephen E. Cohn and David F. Parrish (1991): “The Behavior of Forecast Error Covariances for a Kalman Filter in Two Dimensions”, Monthly Weather Review of the American Meteorological Society, Vol. 119, pp. 1757-1785. However, the ideal Kalman Filter systems described in all such reports are still out of reach for Four Dimensions (i.e. space and time). A reliable estimation and inversion of the error covariance matrix of the state parameters is required, as Dr. Heikki Jarvinen of the European Centre for Medium-range Weather Forecasts (ECMWF) states: “In meteorology, the dimension (=m) of the state parameter vector s_(t) may be 100,000-10,000,000. This makes it impossible in practice to exactly handle the error covariance matrix.”, see “Meteorological Data Assimilation as a Variational Problem”, Report No. 43 (1995), Department of Meteorology, University of Helsinki, page 10. Dr. Adrian Simmons of ECMWF confirms that “the basic approach of Kalman Filtering is well established theoretically, but the computational requirement renders a full implementation intractable.”, see ECMWF Newsletter Number 69 (Spring 1995), page 12.

The Fast Kalman Filtering (FKF) formulas known from PCT/FI90/00122 and PCT/FI93/00192 make use of the assumption that error covariance matrix V_(t) in equations (9) and (13), respectively, is block-diagonal. Please see the FKF formula (19) where these diagonal blocks are written out as: $V_{t,k} = \begin{bmatrix} {{Cov}\left( e_{t,k} \right)} & \quad \\ \quad & {{Cov}\left\{ {{A_{k}\left( {{\hat{s}}_{t - 1} - s_{t - 1}} \right)} - a_{b_{t,k}}} \right\}} \end{bmatrix}$

It is clear especially for the case of adaptive Kalman Filtering (and the 4-dimensional data-assimilation) that the estimates of consecutive state parameter vectors s_(t−1), s_(t−2), s_(t−3), . . . are cross- and auto-correlated.

There exists a need for exploiting the principles of the Fast Kalman Filtering (FKF) method for adaptive Kalman Filtering (AKF) with computational speed, reliability, accuracy, and cost benefits equal to or better than those of other Kalman Filtering methods. The invented method for an exact way of handling the error covariances will be disclosed herein.

SUMMARY OF THE INVENTION

These needs are substantially met by provision of the adaptive Fast Kalman Filtering (FKF) method for calibrating/adjusting various parameters of the dynamic system in real-time or in near real-time. Both the measurement and the system errors are whitened and partially orthogonalized as described in this specification. The FKF computations are made as close to the optimal Kalman Filter as needed under the observability and controllability conditions. The estimated error variances and covariances provide a tool for monitoring the filter stability.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1. Diagram of the disclosed invention.

BEST MODE FOR CARRYING OUT THE INVENTION

We rewrite the linearized Measurement (or observation) equation:

y _(t)=H_(t) s _(t)+F_(t) ^(y)C_(t) +e _(t)  (21)

where e_(t) now represents “white” noise that correlates with neither e_(t−1), e_(t−2), . . . nor ŝ_(t−1), ŝ_(t−2), . . . nor a_(t), a_(t−1), a_(t−2), . . . Matrix H_(t) is still the same design matrix as before that stems from the partial derivatives of the physical dependencies between measurements y_(t) and state parameters s_(t); please see Partitioning (11) above (the old block-diagonality assumption for matrices A and B is no longer valid). Matrix F_(t) ^(y) describes how the systematic errors of the measurements depend on the calibration or “calibration type” parameters, vector C_(t), that are constant in time or vary only slowly. The columns of matrices F_(t) ^(y), F_(t−1) ^(y), F_(t−2) ^(y), . . . represent partial derivatives, wave forms like sinusoids, square waves, “hat” functions, etc. and empirical orthogonal functions (EOF), depending on what is known about physical dependencies, regression and autoregression (AR) of the systematic errors of the measurements. Elements of estimated vector Ĉ_(t) will then determine amplitudes of the “red” noise factors. Compare the quite similar Augmented Model for a moving time volume, given above, for “whitening” observed “innovations” sequences of measurements.

Similarly, we rewrite the linearized System (or state) equation:

s _(t)=(A_(t) +dA_(t))s _(t−1)+B_(t) u _(t−1)+F_(t) ^(s)C_(t) +a _(t)  (22)

where a_(t) now represents “white” noise that correlates with neither e_(t), e_(t−1), e_(t−2), . . . nor ŝ_(t−1), ŝ_(t−2), . . . nor a_(t−1), a_(t−2), . . . . Matrix A_(t) is still the same state transition matrix that stems from the partial derivatives of the physical dependencies between states s_(t) and previous states s_(t−1). Matrix F_(t) ^(s) describes how the systematic errors of the dynamical model (e.g. NWP) depend on the calibration or “calibration type” parameters, vector C_(t), that are constant in time or vary only slowly. The columns of matrices F_(t) ^(s), F_(t−1) ^(s), F_(t−2) ^(s), . . . represent partial derivatives, wave forms like sinusoids, square waves, “hat” functions, etc. and empirical orthogonal functions (EOF), depending on what is known about physical dependencies, regression and autoregression (AR) of the systematic errors of the model. Elements of estimated vector Ĉ_(t) will then determine amplitudes of the “red” noise factors.

Matrix dA_(t) tells how systematic state transition errors of the dynamical (NWP) model depend on prevailing (weather) conditions. If they are unknown but vary only slowly an adjustment is done by moving averaging (MA) in conjunction with the FKF method as described next. The impact is obtained from System equation (22) and is rewritten as follows:

$dA_{t}\, s_{t-1} = \left[ s_{1} I_{(m \times m)},\; s_{2} I_{(m \times m)},\; \ldots,\; s_{m} I_{(m \times m)} \right] \left[ da_{11},\, da_{21},\, \ldots,\, da_{m1},\, da_{12},\, \ldots,\, da_{mm} \right]' = M_{t-1}\, r_{t} \qquad (23)$

where

M_(t−1) is a matrix composed of m diagonal matrices of size m×m,

s₁, s₂, . . . s_(m) are the m scalar elements of state vector s_(t−1),

r_(t) is the column vector of all the m×m elements of matrix dA_(t).

Please note that equation (23) reverses the order of multiplication which now makes it possible to estimate elements of matrix dA_(t) as ordinary regression parameters.
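The reversal of the order of multiplication in equation (23) can be illustrated as follows. This is a minimal sketch assuming NumPy and random toy values; the variable names are hypothetical. The matrix M_(t−1) is built as a Kronecker product and r_(t) as the column-wise stacking of dA_(t), which matches the element order da₁₁, da₂₁, . . . given in (23).

```python
import numpy as np

rng = np.random.default_rng(4)
m = 3
dA = rng.normal(size=(m, m))      # unknown correction to the state transition matrix
s_prev = rng.normal(size=m)       # state vector s_{t-1} with elements s_1..s_m

# M_{t-1} = [s_1 I, s_2 I, ..., s_m I], an m x m^2 matrix of known quantities
M = np.kron(s_prev.reshape(1, m), np.eye(m))

# r_t = [da_11, da_21, ..., da_m1, da_12, ..., da_mm]': columns of dA stacked
r = dA.flatten(order="F")

lhs = dA @ s_prev                 # unknown matrix times known vector
rhs = M @ r                       # known matrix times unknown vector
```

Because `M` contains only known state values, the elements of `r` enter the model linearly and can be estimated like any other regression parameters.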

Consequently, the following gigantic Regression Analysis problem is faced: Augmented Model for a moving time volume (i.e. for whitening innovations sequences of residuals e_(t) and a_(t) for a moving sample of length L): $\begin{bmatrix} \begin{matrix} y_{t} \\ A\hat{s}_{t-1} + Bu_{t-1} \end{matrix} \\ \begin{matrix} y_{t-1} \\ A\hat{s}_{t-2} + Bu_{t-2} \end{matrix} \\ \vdots \\ \begin{matrix} y_{t-L+1} \\ A\hat{s}_{t-L} + Bu_{t-L} \end{matrix} \\ A\hat{C}_{t-1} + Bu_{c_{t-1}} \end{bmatrix} = \begin{bmatrix} \begin{matrix} H_{t} \\ I \end{matrix} & & & & \begin{matrix} F_{t}^{y} & \\ F_{t}^{s} & M_{t-1} \end{matrix} \\ & \begin{matrix} H_{t-1} \\ I \end{matrix} & & & \begin{matrix} F_{t-1}^{y} & \\ F_{t-1}^{s} & M_{t-2} \end{matrix} \\ & & \ddots & & \vdots \\ & & & \begin{matrix} H_{t-L+1} \\ I \end{matrix} & \begin{matrix} F_{t-L+1}^{y} & \\ F_{t-L+1}^{s} & M_{t-L} \end{matrix} \\ & & & & I \end{bmatrix} \begin{bmatrix} s_{t} \\ s_{t-1} \\ \vdots \\ s_{t-L+1} \\ C_{t} \end{bmatrix} + \begin{bmatrix} \begin{matrix} e_{t} \\ A(\hat{s}_{t-1} - s_{t-1}) - a_{t} \end{matrix} \\ \begin{matrix} e_{t-1} \\ A(\hat{s}_{t-2} - s_{t-2}) - a_{t-1} \end{matrix} \\ \vdots \\ \begin{matrix} e_{t-L+1} \\ A(\hat{s}_{t-L} - s_{t-L}) - a_{t-L+1} \end{matrix} \\ A(\hat{C}_{t-1} - C_{t-1}) - a_{c_{t}} \end{bmatrix} \qquad (24)$

Please note that the matrix equation above has a “nested” Block-Angular structure. There can be three different types of “calibration” parameters. The first type, c_(t), is embedded in the data of each time step t. The two other types are represented by vector C_(t). The first set of these parameters is used for the whitening and the partial orthogonalization of the errors of the measurements and of the system (i.e. for block-diagonalization of the error covariance matrix). The second set, i.e. r_(t), is used for correcting gross errors in the state transition matrices. The last two sets of parameters have more or less constant values over the long moving time window and make the Kalman filtering process an adaptive one.

It should also be noted that matrix M_(t−1) cannot take its full size (m×m²) as indicated in equation (23). This is because the observability condition will become unsatisfied as there would be too many unknown quantities. Thus, matrix M_(t−1) must be effectively “compressed” to represent only those elements of matrix A_(t) which are related to serious state transition errors. Such transitions are found by e.g. using so-called maximum correlation methods. In fact, sporadic and slowly migrating patterns may develop in the space of state parameter vectors. These are small-scale phenomena, typically, and they cannot be adequately described by the state transition matrices derived from the model equations only. In order to maintain the filter stability, all the estimated elements of matrix dA_(t) are kept observable in the moving averaging (MA) of measurements e.g. by monitoring their estimated covariances in equation (20).

The Fast Kalman Filter (FKF) formulae for a time window of length L at time point t are then as follows: $\hat{s}_{t-l} = \left\{ X_{t-l}' V_{t-l}^{-1} X_{t-l} \right\}^{-1} X_{t-l}' V_{t-l}^{-1} \left( y_{t-l} - G_{t-l}\hat{C}_{t} \right) \quad \text{for } l = 0, 1, 2, \ldots, L-1$ $\hat{C}_{t} = \left\{ \sum_{l=0}^{L} G_{t-l}' R_{t-l} G_{t-l} \right\}^{-1} \sum_{l=0}^{L} G_{t-l}' R_{t-l} y_{t-l} \qquad (25)$

where, for l=0,1,2, . . . ,L−1,

R_(t−l)=V_(t−l) ⁻¹{I−X_(t−l){X′_(t−l)V_(t−l) ⁻¹X_(t−l)}⁻¹X′_(t−l)V_(t−l) ⁻¹}

$V_{t-l} = \begin{bmatrix} \mathrm{Cov}(e_{t-l}) & \\ & \mathrm{Cov}\left\{ A_{t-l}(\hat{s}_{t-l-1} - s_{t-l-1}) - a_{t-l} \right\} \end{bmatrix}$ $y_{t-l} = \begin{bmatrix} y_{t-l} \\ A_{t-l}\hat{s}_{t-l-1} + B_{t-l}u_{t-l-1} \end{bmatrix}$ $X_{t-l} = \begin{bmatrix} H_{t-l} \\ I \end{bmatrix}$ $G_{t-l} = \begin{bmatrix} F_{t-l}^{y} & 0 \\ F_{t-l}^{s} & M_{t-l-1} \end{bmatrix}$

and, for l=L,

$R_{t-L} = V_{t-L}^{-1}$

$V_{t-L} = \mathrm{Cov}\left\{A_{c}\left(\hat{C}_{t-1} - C_{t-1}\right) - a_{c_{t}}\right\}$

$y_{t-L} = A_{c}\hat{C}_{t-1} + B_{c}u_{c_{t-1}}$

$G_{t-L} = I.$
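To make the structure of formulas (25) concrete, the following is a minimal NumPy sketch, not the patented implementation. It assumes that each window block l=0,…,L−1 is supplied as a tuple (y, X, G, V) and that the l=L block (where G=I and R=V⁻¹) is supplied as a prior mean and covariance for the common parameters. The function name `fkf_window` and this data layout are hypothetical. For brevity the sketch inverts each V directly, whereas a practical implementation would exploit the block-diagonal structure.

```python
import numpy as np

def fkf_window(blocks, prior_mean, prior_cov):
    """Sketch of formulas (25) for one moving window (hypothetical layout).

    blocks: list of (y, X, G, V) tuples for l = 0..L-1, with V symmetric.
    prior_mean, prior_cov: the l = L block, where G = I and R = V^{-1}.
    Returns (c_hat, [s_hat for each block]).
    """
    R_L = np.linalg.inv(prior_cov)
    lhs = R_L.copy()            # running sum of G'RG (l = L term, G = I)
    rhs = R_L @ prior_mean      # running sum of G'Ry (l = L term, G = I)
    cache = []
    for y, X, G, V in blocks:
        Vinv = np.linalg.inv(V)
        XtVinv = X.T @ Vinv
        N = np.linalg.inv(XtVinv @ X)        # {X'V^{-1}X}^{-1}
        R = Vinv - XtVinv.T @ N @ XtVinv     # V^{-1}{I - X N X'V^{-1}}
        lhs += G.T @ R @ G
        rhs += G.T @ R @ y
        cache.append((N @ XtVinv, y, G))
    # Common parameters first, then each state vector by back-substitution.
    c_hat = np.linalg.solve(lhs, rhs)
    s_hats = [P @ (y - G @ c_hat) for P, y, G in cache]
    return c_hat, s_hats
```

Only matrices of the size of one block, or of the common parameter vector, are inverted here; the full stacked regression matrix is never formed.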

It may sometimes be necessary to Shape Filter some of the error terms for the sake of optimality. If this is done, the identity (I) matrices disappear from the FKF formulas and must be replaced accordingly.

The FKF formulas given here and in PCT/FI90/00122 and PCT/FI93/00192 are based on the assumption that error covariance matrices are block-diagonal. Attempts to solve all parameters C_(t) with the conventional Kalman recursions from (4) to (6) are doomed to fail due to serious observability and controllability problems, as computational restrictions prohibit window length L from being taken long enough. Fortunately, by using the FKF formulas, the time window can be taken so long that initialization or temporal training sequences of the filter may become completely redundant.

Various formulas for fast adaptive Kalman Filtering can be derived from the Normal Equation system of the gigantic linearized regression equation (24) by different recursive uses of Frobenius' formula: $\begin{matrix} {\begin{bmatrix} A & B \\ C & D \end{bmatrix}^{-1} = \begin{bmatrix} A^{-1} + A^{-1} B H^{-1} C A^{-1} & -A^{-1} B H^{-1} \\ -H^{-1} C A^{-1} & H^{-1} \end{bmatrix}} & \text{(26)} \end{matrix}$

where H=D−CA⁻¹B. The formulas (20) and (25) as well as any other FKF type of formulas obtained from Frobenius' formula (26) are pursuant to the invented method.
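Frobenius' formula (26) can be checked numerically with a short NumPy sketch. The function name `frobenius_inverse` is a hypothetical helper introduced for illustration. It assumes that A and H=D−CA⁻¹B are both invertible.

```python
import numpy as np

def frobenius_inverse(A, B, C, D):
    """Invert the partitioned matrix [[A, B], [C, D]] via formula (26)."""
    A_inv = np.linalg.inv(A)
    H = D - C @ A_inv @ B                  # H = D - C A^{-1} B
    H_inv = np.linalg.inv(H)
    top_left = A_inv + A_inv @ B @ H_inv @ C @ A_inv
    top_right = -A_inv @ B @ H_inv
    bottom_left = -H_inv @ C @ A_inv
    return np.block([[top_left, top_right], [bottom_left, H_inv]])
```

Only the blocks A and H are inverted, so when A has a structure that is cheap to invert (for example block-diagonal), the cost is dominated by the much smaller matrix H.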

For example, there are effective computational methods for inverting symmetric band-diagonal matrices. The error covariance matrices of numerical weather forecasts are typically band-diagonal. We can proceed directly from equation system (8) without merging state parameters s into the observational blocks of the gigantic Regression Analysis problem (18). Their error covariance matrix can then be inverted as one large block and a recursive use of Frobenius' formula leads to FKF formulas similar to formulas (25).
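The text does not name a particular band-diagonal method. As an illustration of why such matrices are cheap to handle, the sketch below uses the simplest case, a symmetric positive definite tridiagonal system solved with the Thomas algorithm in O(n) operations. The choice of algorithm, the function name `solve_sym_tridiagonal` and the argument layout are assumptions made for illustration and are not taken from the source.

```python
import numpy as np

def solve_sym_tridiagonal(diag, off, rhs):
    """Solve T x = rhs for a symmetric tridiagonal T (Thomas algorithm).

    diag: main diagonal (length n); off: sub/super-diagonal (length n-1).
    Assumes T is positive definite, so no pivoting is performed.
    """
    n = len(diag)
    d = np.array(diag, dtype=float)
    b = np.array(rhs, dtype=float)
    for i in range(1, n):                  # forward elimination
        w = off[i - 1] / d[i - 1]
        d[i] -= w * off[i - 1]
        b[i] -= w * b[i - 1]
    x = np.empty(n)
    x[-1] = b[-1] / d[-1]
    for i in range(n - 2, -1, -1):         # back substitution
        x[i] = (b[i] - off[i] * x[i + 1]) / d[i]
    return x
```

Applying the solver to each column of an identity matrix yields the inverse, although in practice the solve is usually applied directly to the right-hand side that is needed.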

All the matrices to be inverted for solution of the gigantic Regression Analysis models are kept small enough by exploiting the reported semi-analytical computational methods. The preferred embodiment of the invention is shown in FIG. 1 and will be described below:

A supernavigator based on a notebook PC that performs the functions of a Kalman filtering logic unit (1) through exploiting the generalized Fast Kalman Filtering (FKF) method. The overall receiver concept comprises an integrated sensor, remote sensing, data processing and transmission system (3) of, say, a national atmospheric/oceanic service and, optionally, an off-the-shelf GPS receiver. The database unit (2) running on the notebook PC contains updated information on control (4) and performance aspects of the various subsystems as well as auxiliary information such as geographical maps. Based upon all these inputs, the logic unit (1) provides real-time 3-dimensional visualizations (5) of what is going on, by using the FKF recursions for equation system (24), and of what will take place in the nearest future, by using the predictions from equations (15). Dependable accuracy information is also provided when the well-known stability conditions of optimal Kalman filtering are met by the observing system (3). These error variances and covariances are computed by using equations (15) and (20). The centralized data processing system (3) provides estimates of State Transition Matrix A for each time step t. These matrices are then adjusted locally (1) to take into account all observed small-scale transitions that occur in the atmospheric/oceanic environment (see for example Cotton, Thompson & Mielke, 1994: “Real-Time Mesoscale Prediction on Workstations”, Bulletin of the American Meteorological Society, Vol. 75, Number 3, March 1994, pp. 349-362).

Those skilled in the art will appreciate that many variations could be practiced with respect to the above described invention without departing from the spirit of the invention. Therefore, it should be understood that the scope of the invention should not be considered as limited to the specific embodiment described, except in so far as the claims may specifically include such limitations.

REFERENCES

(1) Kalman, R. E. (1960): “A new approach to linear filtering and prediction problems”. Trans. ASME J. of Basic Eng. 82:35-45.

(2) Lange, A. A. (1982): “Multipath propagation of VLF Omega signals”. IEEE PLANS '82—Position Location and Navigation Symposium Record, December 1982, 302-309.

(3) Lange, A. A. (1984): “Integration, calibration and intercomparison of windfinding devices”. WMO Instruments and Observing Methods Report No. 15.

(4) Lange, A. A. (1988a): “A high-pass filter for optimum calibration of observing systems with applications”. Simulation and Optimization of Large Systems, edited by A. J. Osiadacz, Oxford University Press/Clarendon Press, Oxford, 1988, 311-327.

(5) Lange, A. A. (1988b): “Determination of the radiosonde biases by using satellite radiance measurements”. WMO Instruments and Observing Methods Report No. 33, 201-206.

(6) Lange, A. A. (1990): “Apparatus and method for calibrating a sensor system”. International Application Published under the Patent Cooperation Treaty (PCT), World Intellectual Property Organization, International Bureau, WO 90/13794, PCT/FI90/00122, Nov. 15, 1990.

(7) Lange, A. A. (1993): “Method for Fast Kalman Filtering in large dynamic systems”. International Application Published under the Patent Cooperation Treaty (PCT), World Intellectual Property Organization, International Bureau, WO 93/22625, PCT/FI93/00192, Nov. 11, 1993.

(8) Lange, A. A. (1994): “A surface-based hybrid upper-air sounding system”. WMO Instruments and Observing Methods Report No. 57, 175-177. 

What is claimed is:
 1. A method for adjusting model and calibration parameters of a sensor system accompanied with said model of external events by adaptive Kalman filtering, the sensor output units providing signals in response to said external events and where the series of simultaneously processed sensor output signal values are longer than 50, the method comprising the steps of:
a) providing a data base unit for storing information on: a plurality of test point sensor output signal values for some of said sensors and a plurality of values for said external events corresponding to said test point sensor output signal values, or simultaneous time series of said output signal values from adjacent sensors for comparison; said sensor output signal values accompanied with values for said model and calibration parameters and values for said external events corresponding to a situation; and, controls of said sensors and changes in said external events corresponding to a new situation;
b) providing a logic unit for accessing said sensor signal output values with said model and calibration parameters, said logic unit having a two-way communications link to said data base unit, and computing initial values for unknown model and calibration parameters with accuracy estimates by using Lange's High-pass Filter if required;
c) providing said sensor output signal values from said sensors, as available, to said logic unit;
d) providing information on said controls and changes to said data base unit;
e) accessing current values of said model and calibration parameters and elements of a state transition matrix, and computing by using the Fast Kalman Filter (FKF) formulas obtained from Frobenius' inversion formula (26) wherein the improvement comprises a diagonalization of the error covariance matrix to be obtained by applying factors F^(y), F^(s) or M to Augmented Model (8), in said logic unit, updates of said model and calibration parameters, values of said external events and their accuracies corresponding to said new situation;
f) controlling stability of said Kalman filtering by monitoring said accuracy estimates, in said logic unit, and by indicating when there is need for some of the following: more sensor output signal values, test point data, sensor comparison or system reconfiguration;
g) adjusting those of said model and calibration parameter values for which stable updates are available.